1.  INTRODUCTION


The purpose of this report is to present a shorthand notation for matrix
manipulation and formulae of differentiation for matrix quantities. The
shorthand and formulae are especially useful whenever one deals with the
analysis and control of dynamical systems which are described by matrix
differential equations. There are other areas of application, but the control
of matrix differential equations provided the motivation for this study.
References [1] through [6] deal with the analysis and control of dynamical
systems which are described by matrix differential equations.

Much of the material presented in this report is available elsewhere in
different forms; it is summarized herein for the sake of convenience. Two
references were used extensively for the mathematical background; these are
Bodewig (Reference [7]) and Bellman (Reference [8]).

The organization of the report is as follows. In Section 2 we present the
definitions of the unit vectors $e_i$ and of the unit matrices $E_{ij}$. In
Section 3 we indicate the use of the matrices $E_{ij}$ as a basis in the space
of n x n matrices. In Section 4 we present several relations which can be used
to decompose a given matrix into its column and row vectors. Section 5 deals
with operations involving the unit vectors $e_i$ and the unit matrices
$E_{ij}$. In Section 6 we show how the trace function can be used to represent
the scalar product of two matrices. In Section 7 we define the differentials
of a vector and of a matrix and we also define the notion of a gradient
matrix. Section 8 contains a variety of formulae for the gradient matrix of
trace functions. Section 9 contains relations for the gradient matrix of
determinant functions. Section 10 contains relations involving partitioned
matrices. A table summarizing the gradient formulae of Sections 8 and 9 is
also provided.

2.  NOTATION 

Throughout  this  report  column  vectors  will  be  denoted  by  underlined  letters  and 
matrices  by  underlined  capital  letters.  The  prime  (')  will  denote  transposition. 
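A column vector $v$ with components $v_1, v_2, \ldots, v_n$ is

$$ v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \tag{2.1} $$

In particular, the unit vectors $e_1, e_2, \ldots, e_n$ are defined as follows:

$$ e_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad e_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad e_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \tag{2.2} $$

An n x m matrix $A$ with elements $a_{ij}$ ($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m$) is denoted by

$$ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix} \tag{2.3} $$

If m = n, then $A$ is square. If $A = A'$, then $A$ is symmetric.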

The unit matrices $E_{ij}$ are square matrices such that all their elements
are zero, except the one located at the i-th row and j-th column, which is
unity. For example,

$$ E_{12} = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix} \tag{2.4} $$

The unit matrix $E_{ij}$ is related to the unit vectors $e_i$ and $e_j$ as follows:

$$ E_{ij} = e_i e_j' \tag{2.5} $$


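The identity matrix $I$,

$$ I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \tag{2.6} $$

can thus be written

$$ I = \sum_{i=1}^{n} E_{ii} = \sum_{i=1}^{n} e_i e_i' \tag{2.7} $$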


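The one vector $e$ is defined by

$$ e = \sum_{i=1}^{n} e_i = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} \tag{2.8} $$

The one matrix $E$ is defined by

$$ E = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & 1 & \cdots & 1 \\ \vdots & \vdots & & \vdots \\ 1 & 1 & \cdots & 1 \end{bmatrix} = e e' \tag{2.9} $$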
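The definitions (2.2), (2.5), (2.7) and (2.9) translate directly into code. The following is a minimal sketch in plain Python, assuming matrices are stored as lists of rows and that indices run from 1 to n as in the text. The helper names (`unit_vector`, `outer`, `unit_matrix`, `identity`, `one_matrix`) are illustrative only and do not come from the report.

```python
def unit_vector(i, n):
    """Unit vector e_i of length n; i is 1-based as in the text."""
    return [1 if k == i else 0 for k in range(1, n + 1)]


def outer(v, w):
    """Outer product v w' of two vectors, returned as a list of rows."""
    return [[vi * wj for wj in w] for vi in v]


def unit_matrix(i, j, n):
    """Unit matrix E_ij = e_i e_j' (Eq. 2.5)."""
    return outer(unit_vector(i, n), unit_vector(j, n))


def identity(n):
    """Identity matrix as the sum of the E_ii (Eq. 2.7)."""
    total = [[0] * n for _ in range(n)]
    for i in range(1, n + 1):
        e_ii = unit_matrix(i, i, n)
        total = [[total[r][c] + e_ii[r][c] for c in range(n)] for r in range(n)]
    return total


def one_matrix(n):
    """One matrix E = e e', with e the sum of all unit vectors (Eqs. 2.8, 2.9)."""
    e = [sum(col) for col in zip(*(unit_vector(i, n) for i in range(1, n + 1)))]
    return outer(e, e)
```

For instance, `unit_matrix(1, 2, 3)` has its single nonzero entry in the first row and second column, which is the pattern shown in (2.4).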
The trace of an n x n matrix $A$ is defined by

$$ \mathrm{tr}[A] = \sum_{i=1}^{n} a_{ii} \tag{2.10} $$

The trace has the very useful properties

$$ \mathrm{tr}[A + B] = \mathrm{tr}[A] + \mathrm{tr}[B] \tag{2.11} $$

$$ \mathrm{tr}[A B] = \mathrm{tr}[B A] \tag{2.12} $$

The determinant of an n x n matrix $A$ will be denoted by $\det[A]$.


3.  SPACES

We shall denote by

$R_n$: the set of all real column vectors $v$ with n components $v_1, v_2, \ldots, v_n$;

$M_{nn}$: the set of all real n x n matrices.

Both $R_n$ and $M_{nn}$ are linear vector spaces.

The unit vectors $e_1, e_2, \ldots, e_n$ (see Eq. (2.2)) belong to $R_n$ and,
furthermore, form a basis in $R_n$. Thus, every $v \in R_n$ can be represented by

$$ v = \sum_{i=1}^{n} v_i e_i \tag{3.1} $$

Similarly, the unit matrices $E_{ij}$ belong to $M_{nn}$ and they form a basis
in $M_{nn}$. Thus, every n x n matrix $A \in M_{nn}$ can be represented by
($a_{ij}$ are the elements of $A$)

$$ A = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} E_{ij} \tag{3.2} $$

The dimension of $R_n$ is $n$ and the dimension of $M_{nn}$ is $n^2$.

Note that the transpose $A'$ of $A$ can be written as

$$ A' = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ji} E_{ij} \tag{3.3} $$
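The basis expansions (3.2) and (3.3) can be checked on a small example. The sketch below is plain Python with 0-based indices (an assumption made for convenience; the text counts from 1), and the helper `expand` is a hypothetical name introduced only for this illustration.

```python
def unit_matrix(i, j, n):
    """Unit matrix E_ij, here with 0-based indices i and j."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]


def expand(coeff, n):
    """Return the double sum over i, j of coeff(i, j) * E_ij."""
    total = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            e_ij = unit_matrix(i, j, n)
            for r in range(n):
                for c in range(n):
                    total[r][c] += coeff(i, j) * e_ij[r][c]
    return total


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
rebuilt = expand(lambda i, j: A[i][j], 3)      # Eq. (3.2): coefficients a_ij
transposed = expand(lambda i, j: A[j][i], 3)   # Eq. (3.3): coefficients a_ji
```

Using the coefficients $a_{ij}$ reproduces `A`, while swapping them to $a_{ji}$ yields the transpose, which is all that (3.3) states.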


4.  SOME USEFUL DECOMPOSITIONS OF A MATRIX


In this section we shall develop certain formulae relating a matrix, its
elements, and its row and column vectors.

If $A$ is an n x n matrix we shall denote its row vectors by
$a_{1*}, a_{2*}, \ldots, a_{n*}$ and its column vectors by
$a_{*1}, a_{*2}, \ldots, a_{*n}$, where

$$ a_{i*} = \begin{bmatrix} a_{i1} \\ a_{i2} \\ \vdots \\ a_{in} \end{bmatrix} \tag{4.1} $$

$$ a_{*j} = \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{nj} \end{bmatrix} \tag{4.2} $$

We emphasize that both types of vectors $a_{i*}$ and $a_{*j}$ are column vectors.

We shall now indicate how one can write the elements $a_{ij}$ and the row and
column vectors of a matrix $A$ in terms of $A$ and in terms of the unit
vectors (see Eq. (2.2)):

$$ a_{ij} = e_i' A e_j = e_j' A' e_i \tag{4.3} $$

$$ a_{i*} = A' e_i \quad \text{(the transpose of the i-th row of } A\text{)} \tag{4.4} $$

$$ a_{i*}' = e_i' A \quad \text{(the i-th row vector of } A\text{)} \tag{4.5} $$

$$ a_{*j} = A e_j \quad \text{(the j-th column vector of } A\text{)} \tag{4.6} $$

The element $a_{ij}$ can also be generated as follows:

$$ a_{ij} = a_{i*}' e_j \tag{4.7} $$

or

$$ a_{ij} = e_i' a_{*j} \tag{4.8} $$


Next we shall indicate the relation of the row and column vectors of $A$ to
the elements of $A$. From Eqs. (4.3), (4.4), (4.5), and (4.6) we deduce that

$$ a_{i*} = \sum_{j=1}^{n} a_{ij} e_j \tag{4.9} $$

$$ a_{i*}' = \sum_{j=1}^{n} a_{ij} e_j' \tag{4.10} $$

$$ a_{*j} = \sum_{i=1}^{n} a_{ij} e_i \tag{4.11} $$


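The matrix $A$ can be generated as follows: from Eqs. (2.5) and (3.2) we have

$$ A = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} E_{ij} = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} e_i e_j' = \sum_{i=1}^{n} \sum_{j=1}^{n} e_i a_{ij} e_j' \tag{4.12} $$

From Eqs. (4.12), (4.9), (4.10) and (4.11) we obtain

$$ A = \sum_{i=1}^{n} e_i a_{i*}' \tag{4.13} $$

and

$$ A = \sum_{j=1}^{n} a_{*j} e_j' \tag{4.14} $$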
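The relations of this section amount to selecting rows, columns and elements of $A$ by multiplying with unit vectors. A minimal plain-Python sketch, assuming 0-based indices and lists of rows; the function names are illustrative and not part of the report:

```python
def unit_vector(i, n):
    """Unit vector e_i of length n (0-based index)."""
    return [1 if k == i else 0 for k in range(n)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matvec(a, v):
    """Matrix-vector product A v."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def dot(v, w):
    return sum(x * y for x, y in zip(v, w))


def element(a, i, j):
    """a_ij = e_i' A e_j (Eq. 4.3)."""
    n = len(a)
    return dot(unit_vector(i, n), matvec(a, unit_vector(j, n)))


def row_vector(a, i):
    """a_{i*} = A' e_i, the i-th row stored as a column vector (Eq. 4.4)."""
    return matvec(transpose(a), unit_vector(i, len(a)))


def column_vector(a, j):
    """a_{*j} = A e_j (Eq. 4.6)."""
    return matvec(a, unit_vector(j, len(a)))


def rebuild_from_rows(a):
    """A = sum over i of e_i a_{i*}' (Eq. 4.13)."""
    n = len(a)
    total = [[0] * n for _ in range(n)]
    for i in range(n):
        e_i, a_i = unit_vector(i, n), row_vector(a, i)
        for r in range(n):
            for c in range(n):
                total[r][c] += e_i[r] * a_i[c]
    return total


A = [[1, 2], [3, 4]]
```

In practice one would index `A` directly; the point of the sketch is only that the unit-vector formulae return the same rows, columns and elements.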


5.  FORMULAE INVOLVING THE UNIT MATRICES $E_{ij}$

First of all, if we define the Kronecker delta $\delta_{ij}$ by

$$ \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \tag{5.1} $$

then we have the relation

$$ e_i' e_j = e_j' e_i = \delta_{ij} \tag{5.2} $$


The following two relations relate operations between unit matrices and unit
vectors (see Eq. (2.5)):

$$ E_{ij} e_k = e_i e_j' e_k = \delta_{jk} e_i \tag{5.3} $$

$$ e_k' E_{ij} = e_k' e_i e_j' = \delta_{ki} e_j' \tag{5.4} $$

The following relations relate unit matrices:

$$ E_{ij} E_{km} = e_i e_j' e_k e_m' = \delta_{jk} e_i e_m' = \delta_{jk} E_{im} \tag{5.5} $$

It follows that

$$ E_{ij} E_{ij} = E_{ij}^2 = \delta_{ji} E_{ij} = \delta_{ij} E_{ij} \tag{5.6} $$

$$ E_{ij} E_{jk} = \delta_{jj} E_{ik} = E_{ik} \tag{5.7} $$

$$ E_{ij} E_{ji} = E_{ii} \tag{5.8} $$

$$ E_{ii}^{\alpha} = E_{ii}; \quad \alpha = 1, 2, \ldots \tag{5.9} $$

$$ E_{ij} E_{jk} E_{km} = E_{ik} E_{km} = E_{im} \tag{5.10} $$

Equation (5.10) generalizes to

$$ E_{i_1 i_2} E_{i_2 i_3} E_{i_3 i_4} \cdots E_{i_{p-1} i_p} = E_{i_1 i_p} \tag{5.11} $$
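We shall next consider the matrix $E_{ij} A$. From Eqs. (3.2), (4.10), and
(5.5) we establish that

$$ E_{ij} A = E_{ij} \sum_{\alpha=1}^{n} \sum_{\beta=1}^{n} a_{\alpha\beta} E_{\alpha\beta} = \sum_{\alpha=1}^{n} \sum_{\beta=1}^{n} a_{\alpha\beta} E_{ij} E_{\alpha\beta} = \sum_{\alpha=1}^{n} \sum_{\beta=1}^{n} a_{\alpha\beta} \delta_{j\alpha} E_{i\beta} = \sum_{\beta=1}^{n} a_{j\beta} E_{i\beta} = e_i \sum_{\beta=1}^{n} a_{j\beta} e_\beta' \tag{5.12} $$

which reduces to (in view of Eq. (4.10))

$$ E_{ij} A = e_i a_{j*}' \tag{5.13} $$

Similarly we can establish that

$$ A E_{ij} = a_{*i} e_j' \tag{5.14} $$

and that

$$ E_{ij} A E_{km} = a_{jk} E_{im} \tag{5.15} $$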
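The product rules (5.5) and (5.15) are easy to confirm numerically. The sketch below uses plain Python with 0-based indices (an assumption; the text counts from 1), and the helpers `unit_matrix`, `matmul` and `scale` are names chosen for this illustration only.

```python
def unit_matrix(i, j, n):
    """Unit matrix E_ij with 0-based indices."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]


def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def scale(c, a):
    """Scalar multiple c * A."""
    return [[c * x for x in row] for row in a]


n = 3
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Eq. (5.5) with j = k: E_01 E_12 = E_02
chained = matmul(unit_matrix(0, 1, n), unit_matrix(1, 2, n))

# Eq. (5.5) with j != k: E_01 E_21 = 0
mismatched = matmul(unit_matrix(0, 1, n), unit_matrix(2, 1, n))

# Eq. (5.15): E_ij A E_km = a_jk E_im, here with i=0, j=1, k=2, m=0
sandwich = matmul(matmul(unit_matrix(0, 1, n), A), unit_matrix(2, 0, n))
```

The last product picks out the single element $a_{jk}$ (here `A[1][2]`) and places it at position (i, m).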


6.  INNER PRODUCTS AND THE TRACE FUNCTION


Suppose that $v$ and $w$ are n-vectors (elements of $R_n$); then the common
scalar product

$$ (v, w) = v'w = w'v = \sum_{i=1}^{n} v_i w_i \tag{6.1} $$

is an inner product.

In an analogous manner we define an inner product between two matrices. Let
us suppose that $A$ and $B$, with elements $a_{ij}$ and $b_{ij}$
respectively, are elements of $M_{nn}$. It can be shown that the mapping

$$ (A, B) = \mathrm{tr}[A B'] = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} b_{ij} \tag{6.2} $$

has all the properties of an inner product because

$$ \mathrm{tr}[A B'] = \mathrm{tr}[B A'] \tag{6.3} $$

$$ \mathrm{tr}[r A B'] = r\, \mathrm{tr}[A B'] \quad (r: \text{real scalar}) \tag{6.4} $$

$$ \mathrm{tr}[(A + B) C'] = \mathrm{tr}[A C'] + \mathrm{tr}[B C'] \tag{6.5} $$
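As an illustration of (6.2) and (6.3), the trace form and the elementwise sum can be compared directly. This is a small plain-Python sketch; `inner` and `elementwise_sum` are hypothetical helper names, not notation from the report.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def inner(a, b):
    """Matrix inner product (A, B) = tr[A B'] (Eq. 6.2)."""
    return trace(matmul(a, transpose(b)))


def elementwise_sum(a, b):
    """Double sum of a_ij * b_ij, the right-hand side of Eq. (6.2)."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
```

Here both `inner(A, B)` and `elementwise_sum(A, B)` evaluate to 1*5 + 2*6 + 3*7 + 4*8 = 70, and exchanging the arguments leaves the value unchanged, as (6.3) requires.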


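We shall present below some interesting properties of the trace. Since

$$ \mathrm{tr}[A] = \sum_{i=1}^{n} a_{ii} \tag{6.6} $$

and since (see Eq. (4.3))

$$ a_{ii} = e_i' A e_i \tag{6.7} $$

we have

$$ \mathrm{tr}[A] = \sum_{i=1}^{n} e_i' A e_i \tag{6.8} $$

From Eqs. (6.8), (4.5), and (4.6) we also obtain

$$ \mathrm{tr}[A] = \sum_{i=1}^{n} e_i' a_{*i} \tag{6.9} $$

$$ \mathrm{tr}[A] = \sum_{i=1}^{n} a_{i*}' e_i \tag{6.10} $$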
Now we shall consider $\mathrm{tr}[A B]$. From Eq. (6.8) we have

$$ \mathrm{tr}[A B] = \sum_{i=1}^{n} e_i' A B e_i \tag{6.11} $$

We can also express $\mathrm{tr}[A B]$ in terms of the column and row vectors
of $A$ and $B$. From Eqs. (6.11), (4.5) and (4.6) we have

$$ \mathrm{tr}[A B] = \sum_{i=1}^{n} a_{i*}' b_{*i} \tag{6.12} $$

Since (see Eq. (2.12))

$$ \mathrm{tr}[A B] = \mathrm{tr}[B A] \tag{6.13} $$

we obtain similarly

$$ \mathrm{tr}[A B] = \sum_{i=1}^{n} b_{i*}' a_{*i} \tag{6.14} $$

and that

$$ \mathrm{tr}[A B] = \sum_{i=1}^{n} \sum_{k=1}^{n} a_{ik} b_{ki} \tag{6.15} $$

Similarly we deduce that

$$ \mathrm{tr}[A B'] = \sum_{i=1}^{n} a_{i*}' b_{i*} \tag{6.16} $$

$$ \mathrm{tr}[A B'] = \sum_{i=1}^{n} a_{*i}' b_{*i} \tag{6.17} $$


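Another very interesting formula is the following. Let $v$ and $w$ be two
column vectors; then $v w'$ and $w v'$ are n x n matrices. Hence, by Eq. (6.8),

$$ \mathrm{tr}[v w'] = \sum_{i=1}^{n} e_i' v w' e_i \tag{6.18} $$

But

$$ e_i' v = v_i, \quad w' e_i = w_i \tag{6.19} $$

and, so,

$$ \mathrm{tr}[v w'] = \sum_{i=1}^{n} v_i w_i \tag{6.20} $$

Since the right-hand side is the scalar product (6.1), we have

$$ \mathrm{tr}[v w'] = v'w \tag{6.21} $$

$$ \mathrm{tr}[w v'] = v'w \tag{6.22} $$

and, so,

$$ \mathrm{tr}[v w'] = \mathrm{tr}[w v'] \tag{6.23} $$

Next we consider $\mathrm{tr}[A B C]$. From Eq. (6.8) we have

$$ \mathrm{tr}[A B C] = \sum_{i=1}^{n} e_i' A B C e_i \tag{6.24} $$

It follows that

$$ \mathrm{tr}[A B C] = \sum_{i=1}^{n} a_{i*}' B c_{*i} \tag{6.25} $$

Since

$$ B = \sum_{j=1}^{n} e_j b_{j*}' \tag{6.26} $$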
we can also deduce that

$$ \mathrm{tr}[A B C] = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{i*}' e_j b_{j*}' c_{*i} \tag{6.27} $$

Additional relationships can be derived using the equations

$$ \mathrm{tr}[A B C] = \mathrm{tr}[B C A] = \mathrm{tr}[C A B] \tag{6.28} $$
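Two of the trace identities of this section, the outer-product rule tr[v w'] = v'w and the cyclic rule (6.28), can be verified with integer matrices so that the comparison is exact. A minimal sketch in plain Python; the sample vectors and matrices are arbitrary choices made for the illustration.

```python
def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def outer(v, w):
    """Outer product v w'."""
    return [[vi * wj for wj in w] for vi in v]


def dot(v, w):
    """Scalar product v'w (Eq. 6.1)."""
    return sum(x * y for x, y in zip(v, w))


v, w = [1, 2, 3], [4, 5, 6]
trace_of_outer = trace(outer(v, w))   # tr[v w']
scalar_product = dot(v, w)            # v'w

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]
t_abc = trace(matmul(matmul(A, B), C))
t_bca = trace(matmul(matmul(B, C), A))
t_cab = trace(matmul(matmul(C, A), B))
```

Note that the cyclic rule permits only cyclic shifts of the factors; tr[A C B] is in general a different number.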


7.  DIFFERENTIALS AND GRADIENT MATRICES


The relations which we have established will be used to develop compact
notations for differentiation of matrix quantities.

Let $x$ be a column vector with components $x_1, x_2, \ldots, x_n$. Then the
differential $dx$ of $x$ is simply

$$ dx = \begin{bmatrix} dx_1 \\ dx_2 \\ \vdots \\ dx_n \end{bmatrix} \tag{7.1} $$


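Now let $f(\cdot)$ be a scalar real valued function so that

$$ f(x) = f(x_1, x_2, \ldots, x_n) $$

The gradient vector of $f(\cdot)$ with respect to $x$ is defined as

$$ \frac{\partial f(x)}{\partial x} = \begin{bmatrix} \partial f / \partial x_1 \\ \partial f / \partial x_2 \\ \vdots \\ \partial f / \partial x_n \end{bmatrix} \tag{7.2} $$

For example, suppose that n = 2, and that

$$ f(x) = f(x_1, x_2) = 3 x_1^2 + x_1 x_2 + \tfrac{1}{2} x_2^2 $$

Then

$$ \frac{\partial f(x)}{\partial x} = \begin{bmatrix} 6 x_1 + x_2 \\ x_1 + x_2 \end{bmatrix} $$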


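Now let $X$ be an n x n matrix with elements $x_{ij}$ ($i, j = 1, 2, \ldots, n$).
The differential $dX$ of $X$ is an n x n matrix such that

$$ dX = \begin{bmatrix} dx_{11} & dx_{12} & \cdots & dx_{1n} \\ dx_{21} & dx_{22} & \cdots & dx_{2n} \\ \vdots & \vdots & & \vdots \\ dx_{n1} & dx_{n2} & \cdots & dx_{nn} \end{bmatrix} \tag{7.3} $$

Note that the usual rules prevail:

$$ d(aX) = a\, dX \quad (a: \text{scalar}) \tag{7.4} $$

$$ d(X + Y) = dX + dY \tag{7.5} $$

$$ d(X Y) = (dX) Y + X (dY) \tag{7.6} $$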


From (7.6) we can obtain the useful formula developed below. Suppose that

$$ X = Y^{-1} \tag{7.7} $$

so that

$$ X Y = I \quad \text{(the identity matrix)} \tag{7.8} $$

and, so,

$$ (dX) Y + X (dY) = dI = 0 \tag{7.9} $$

It follows that

$$ dX = -X (dY) Y^{-1} \tag{7.10} $$

and that

$$ d(Y^{-1}) = -Y^{-1} (dY) Y^{-1} \tag{7.11} $$
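Formula (7.11) is a first-order statement: for a small perturbation dY, the change in the inverse agrees with -Y^{-1}(dY)Y^{-1} up to terms of second order in dY. The sketch below checks this for a 2 x 2 matrix in plain Python. The closed-form 2 x 2 inverse in `inv2` and the sample values of `Y` and `dY` are assumptions made only for the illustration.

```python
def inv2(y):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = y
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def first_order_change(y, dy):
    """-Y^{-1} (dY) Y^{-1}, the differential of the inverse (Eq. 7.11)."""
    y_inv = inv2(y)
    prod = matmul(matmul(y_inv, dy), y_inv)
    return [[-x for x in row] for row in prod]


def actual_change(y, dy):
    """(Y + dY)^{-1} - Y^{-1}, computed without approximation."""
    y_new = [[y[r][c] + dy[r][c] for c in range(2)] for r in range(2)]
    new_inv, old_inv = inv2(y_new), inv2(y)
    return [[new_inv[r][c] - old_inv[r][c] for c in range(2)] for r in range(2)]


Y = [[2.0, 1.0], [1.0, 3.0]]
dY = [[1e-6, -2e-6], [3e-6, 5e-7]]

predicted = first_order_change(Y, dY)
observed = actual_change(Y, dY)
gap = max(abs(p - q) for rp, rq in zip(predicted, observed) for p, q in zip(rp, rq))
```

With entries of `dY` of order 1e-6, the entries of `predicted` are of the same order, while `gap` is several orders of magnitude smaller, as expected of a second-order remainder.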


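Next we consider the concept of the gradient matrix. Let $X$ be an n x n
matrix with elements $x_{ij}$. Let $f(\cdot)$ be a scalar, real-valued
function of the $x_{ij}$, i.e.

$$ f(X) = f(x_{11}, \ldots, x_{1n}, x_{21}, \ldots, x_{nn}) \tag{7.12} $$

We can compute the partial derivatives

$$ \frac{\partial f(X)}{\partial x_{ij}}; \quad i, j = 1, 2, \ldots, n \tag{7.13} $$

We define an n x n matrix $\partial f(X) / \partial X$, called the gradient
matrix of $f(X)$ with respect to $X$, as the matrix whose ij-th element is
given by (7.13). We can use Eq. (4.12) to precisely define the gradient matrix
as follows:

$$ \frac{\partial f(X)}{\partial X} = \sum_{i=1}^{n} \sum_{j=1}^{n} e_i \frac{\partial f(X)}{\partial x_{ij}} e_j' \tag{7.14} $$

or, from Eq. (3.2), to write

$$ \frac{\partial f(X)}{\partial X} = \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial f(X)}{\partial x_{ij}} E_{ij} \tag{7.15} $$

For example, suppose that $X$ is a 2 x 2 matrix and that

$$ f(X) = x_{11}^2 x_{21} + x_{21}^3 - x_{11} x_{22} x_{12} + 5 x_{21} $$

Then

$$ \frac{\partial f(X)}{\partial X} = \begin{bmatrix} 2 x_{11} x_{21} - x_{22} x_{12} & -x_{11} x_{22} \\ x_{11}^2 + 3 x_{21}^2 + 5 & -x_{11} x_{12} \end{bmatrix} $$

Suppose that the elements $x_{ij}$ of $X$ represent independent variables, that is

$$ \frac{\partial x_{\alpha\beta}}{\partial x_{ij}} = \begin{cases} 1 & \text{if } \alpha = i,\ \beta = j \\ 0 & \text{otherwise} \end{cases} \tag{7.16} $$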
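The gradient matrix simply arranges the partial derivatives (7.13) in the same layout as X, so it can be approximated entry by entry with finite differences. The sketch below does this in plain Python for a 2 x 2 polynomial example, taken here to be f(X) = x11^2 x21 + x21^3 - x11 x22 x12 + 5 x21 (this reading of the example is an assumption). The function names, the step size `h` and the test point `X` are likewise choices made for the illustration.

```python
def f(x):
    """Example scalar function of a 2 x 2 matrix (assumed form, see lead-in)."""
    (x11, x12), (x21, x22) = x
    return x11 ** 2 * x21 + x21 ** 3 - x11 * x22 * x12 + 5 * x21


def analytic_gradient(x):
    """Gradient matrix of f: entry (i, j) is df/dx_ij."""
    (x11, x12), (x21, x22) = x
    return [[2 * x11 * x21 - x22 * x12, -x11 * x22],
            [x11 ** 2 + 3 * x21 ** 2 + 5, -x11 * x12]]


def numerical_gradient(func, x, h=1e-6):
    """Central-difference estimate of the matrix of partials d func / d x_ij."""
    grad = [[0.0] * len(x[0]) for _ in x]
    for i in range(len(x)):
        for j in range(len(x[0])):
            plus = [row[:] for row in x]
            minus = [row[:] for row in x]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (func(plus) - func(minus)) / (2 * h)
    return grad


X = [[1.0, 2.0], [3.0, 4.0]]
exact = analytic_gradient(X)
approx = numerical_gradient(f, X)
worst = max(abs(p - q) for ra, rb in zip(exact, approx) for p, q in zip(ra, rb))
```

Each entry of `approx` perturbs exactly one element of X while holding the others fixed, which is the independence assumption (7.16) in computational form.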
A useful formula is as follows:

$$ \frac{dX}{dx_{ij}} = \frac{\partial X}{\partial x_{ij}} = E_{ij} \tag{7.17} $$

If $X = X'$, i.e. if $X$ is symmetric, then $x_{ij} = x_{ji}$ for all i and j.
Clearly the differential $dX$ is symmetric and

$$ dX = dX' \tag{7.18} $$

$$ (dX)' = dX \tag{7.19} $$


8.  GRADIENT MATRICES OF TRACE FUNCTIONS


In this section we shall derive formulae which are useful when one is
interested in obtaining the gradient matrix of the trace of a matrix which
depends upon the matrix $X$. Throughout the section, we shall assume that $X$
is an n x n matrix with elements $x_{ij}$ such that

$$ \frac{\partial x_{\alpha\beta}}{\partial x_{ij}} = \begin{cases} 1 & \text{if } \alpha = i,\ \beta = j \\ 0 & \text{otherwise} \end{cases} \tag{8.1} $$


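First, we shall compute

$$ \frac{\partial}{\partial X} \mathrm{tr}[X] \tag{8.2} $$

Since the differential and the trace are linear operators we have

$$ d\, \mathrm{tr}[X] = \mathrm{tr}[dX] \tag{8.3} $$

Hence, in view of (7.17),

$$ \frac{\partial}{\partial x_{ij}} \mathrm{tr}[X] = \mathrm{tr}\left[ \frac{\partial X}{\partial x_{ij}} \right] = \mathrm{tr}[E_{ij}] \tag{8.4} $$

From (7.15) and (8.4) we have

$$ \frac{\partial}{\partial X} \mathrm{tr}[X] = \sum_{i=1}^{n} \sum_{j=1}^{n} \mathrm{tr}[E_{ij}]\, E_{ij} \tag{8.5} $$

But

$$ \mathrm{tr}[E_{ij}] = \delta_{ij} \tag{8.6} $$

It follows from (8.5) and (8.6) that

$$ \frac{\partial}{\partial X} \mathrm{tr}[X] = \sum_{i=1}^{n} \sum_{j=1}^{n} \delta_{ij} E_{ij} = \sum_{i=1}^{n} E_{ii} \tag{8.7} $$

In view of (2.7) we conclude that

$$ \frac{\partial}{\partial X} \mathrm{tr}[X] = I \tag{8.8} $$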


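Next we shall compute the matrix

$$ \frac{\partial}{\partial X} \mathrm{tr}[A X] \tag{8.9} $$

Proceeding as above we have:

$$ \frac{\partial}{\partial x_{ij}} \mathrm{tr}[A X] = \mathrm{tr}\left[ A \frac{\partial X}{\partial x_{ij}} \right] = \mathrm{tr}[A E_{ij}] \quad \text{(by (7.17))} \tag{8.10} $$

But

$$ \begin{aligned}
\frac{\partial}{\partial X} \mathrm{tr}[A X]
&= \sum_{i,j} e_i \frac{\partial\, \mathrm{tr}[A X]}{\partial x_{ij}} e_j' && \text{(by (7.14))} \\
&= \sum_{i,j} e_i\, \mathrm{tr}[A E_{ij}]\, e_j' && \text{(by (8.10))} \\
&= \sum_{i,j,k} e_i e_k' A E_{ij} e_k e_j' && \text{(by (6.8))} \\
&= \sum_{i,j,k} E_{ik} A E_{ij} E_{kj} && \text{(by (2.5))} \\
&= \sum_{i,j,k} E_{ik} A\, \delta_{jk} E_{ij} && \text{(by (5.5))} \\
&= \sum_{i,j} E_{ij} A E_{ij} \\
&= \sum_{i,j} a_{ji} E_{ij} && \text{(by (5.15))} \\
&= A' && \text{(by (3.3))}
\end{aligned} \tag{8.11} $$

Thus, we have shown that

$$ \frac{\partial}{\partial X} \mathrm{tr}[A X] = A' \tag{8.12} $$

In a completely analogous manner we find the following:

$$ \frac{\partial}{\partial X} \mathrm{tr}[A X B] = A' B' \tag{8.13} $$

$$ \frac{\partial}{\partial X} \mathrm{tr}[A X' B] = B A \tag{8.14} $$

$$ \frac{\partial}{\partial X'} \mathrm{tr}[A X] = A \tag{8.15} $$

$$ \frac{\partial}{\partial X'} \mathrm{tr}[A X'] = A' \tag{8.16} $$

$$ \frac{\partial}{\partial X'} \mathrm{tr}[A X B] = B A \tag{8.17} $$

$$ \frac{\partial}{\partial X'} \mathrm{tr}[A X' B] = A' B' \tag{8.18} $$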
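The linear trace formulae can be spot-checked by differentiating numerically. The following plain-Python sketch compares central-difference gradients of tr[AX] and tr[AXB] with A' and A'B'. The helper names, the step size and the sample matrices are assumptions made for the illustration; since both functions are linear in X, the finite-difference estimate is exact up to rounding.

```python
def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def numerical_gradient(func, x, h=1e-6):
    """Central-difference estimate of the matrix of partials d func / d x_ij."""
    grad = [[0.0] * len(x[0]) for _ in x]
    for i in range(len(x)):
        for j in range(len(x[0])):
            plus = [row[:] for row in x]
            minus = [row[:] for row in x]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (func(plus) - func(minus)) / (2 * h)
    return grad


def max_abs_diff(a, b):
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.0, 1.0], [2.0, -1.0]]
X = [[0.5, -1.0], [2.0, 1.5]]

grad_ax = numerical_gradient(lambda x: trace(matmul(A, x)), X)
grad_axb = numerical_gradient(lambda x: trace(matmul(matmul(A, x), B)), X)
expected_ax = transpose(A)                           # A'
expected_axb = matmul(transpose(A), transpose(B))    # A'B'
```

The same harness can be pointed at tr[AX'B] by transposing `x` inside the lambda, in which case the expected gradient is the product BA.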


A useful lemma (which was proved in the derivation of Eq. (8.10)) is the following:

Lemma 8.1

If $\dfrac{\partial}{\partial x_{ij}} \mathrm{tr}[A X] = \mathrm{tr}[A E_{ij}]$, then $\dfrac{\partial}{\partial X} \mathrm{tr}[A X] = A'$.

Next we turn our attention to the derivation of gradient matrices of trace
functions involving quadratic forms of the matrix $X$.

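Consider

$$ \frac{\partial}{\partial X} \mathrm{tr}[X^2] \tag{8.19} $$

Since

$$ \begin{aligned}
d\, \mathrm{tr}[X^2] &= \mathrm{tr}[dX^2] = \mathrm{tr}[X\, dX + (dX) X] \\
&= \mathrm{tr}[X\, dX] + \mathrm{tr}[(dX) X] \\
&= \mathrm{tr}[X\, dX] + \mathrm{tr}[X\, dX] = 2\, \mathrm{tr}[X\, dX]
\end{aligned} \tag{8.20} $$

we conclude that

$$ \frac{\partial}{\partial x_{ij}} \mathrm{tr}[X^2] = 2\, \mathrm{tr}\left[ X \frac{\partial X}{\partial x_{ij}} \right] = 2\, \mathrm{tr}[X E_{ij}] \tag{8.21} $$

It follows from Lemma 8.1 that

$$ \frac{\partial}{\partial X} \mathrm{tr}[X^2] = 2 X' \tag{8.22} $$

In a similar fashion one can prove that

$$ \frac{\partial}{\partial X} \mathrm{tr}[X X'] = 2 X \tag{8.23} $$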


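Next we consider

$$ \frac{\partial}{\partial X} \mathrm{tr}[A X B X] \tag{8.24} $$

Since

$$ \begin{aligned}
d\, \mathrm{tr}[A X B X] &= \mathrm{tr}[d(A X B X)] \\
&= \mathrm{tr}[A (dX) B X] + \mathrm{tr}[A X B (dX)] \\
&= \mathrm{tr}[B X A (dX)] + \mathrm{tr}[A X B (dX)] \\
&= \mathrm{tr}[(B X A + A X B)(dX)]
\end{aligned} \tag{8.25} $$

we conclude that

$$ \frac{\partial}{\partial X} \mathrm{tr}[A X B X] = A' X' B' + B' X' A' \tag{8.26} $$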


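Next we consider

$$ \frac{\partial}{\partial X} \mathrm{tr}[A X B X'] \tag{8.27} $$

Since

$$ \begin{aligned}
d\, \mathrm{tr}[A X B X'] &= \mathrm{tr}[A (dX) B X'] + \mathrm{tr}[A X B (dX')] \\
&= \mathrm{tr}[B X' A (dX)] + \mathrm{tr}[(dX)' A X B] \\
&= \mathrm{tr}[B X' A (dX)] + \mathrm{tr}[B' X' A' (dX)] \\
&= \mathrm{tr}[(B X' A + B' X' A')(dX)]
\end{aligned} \tag{8.28} $$

(because $(dX') = (dX)'$ and because $\mathrm{tr}[Y] = \mathrm{tr}[Y']$ for all $Y$), it follows that

$$ \frac{\partial}{\partial X} \mathrm{tr}[A X B X'] = A' X B' + A X B \tag{8.29} $$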
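The two quadratic results, A'X'B' + B'X'A' for tr[AXBX] and A'XB' + AXB for tr[AXBX'], differ only in where the transposes fall, which makes them easy to confuse. A finite-difference comparison is a quick safeguard. The plain-Python sketch below is one way to do it; the helper names and sample matrices are assumptions chosen for the illustration.

```python
def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def chain(*mats):
    """Product of several matrices, multiplied from left to right."""
    result = mats[0]
    for m in mats[1:]:
        result = matmul(result, m)
    return result


def numerical_gradient(func, x, h=1e-6):
    """Central-difference estimate of the matrix of partials d func / d x_ij."""
    grad = [[0.0] * len(x[0]) for _ in x]
    for i in range(len(x)):
        for j in range(len(x[0])):
            plus = [row[:] for row in x]
            minus = [row[:] for row in x]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (func(plus) - func(minus)) / (2 * h)
    return grad


def max_abs_diff(a, b):
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.0, 1.0], [2.0, -1.0]]
X = [[0.5, -1.0], [2.0, 1.5]]
At, Bt, Xt = transpose(A), transpose(B), transpose(X)

grad_axbx = numerical_gradient(lambda x: trace(chain(A, x, B, x)), X)
grad_axbxt = numerical_gradient(lambda x: trace(chain(A, x, B, transpose(x))), X)
expected_axbx = add(chain(At, Xt, Bt), chain(Bt, Xt, At))   # Eq. (8.26)
expected_axbxt = add(chain(At, X, Bt), chain(A, X, B))      # Eq. (8.29)
```

Because both functions are quadratic in X, the central difference has no truncation error and the agreement is limited only by rounding.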


The following two equations involve higher powers of $X$ and they are easy to derive:

$$ \frac{\partial}{\partial X} \mathrm{tr}[X^n] = n (X^{n-1})' = n (X')^{n-1} \tag{8.30} $$

$$ \frac{\partial}{\partial X} \mathrm{tr}[A X^n] = \left( A X^{n-1} + X A X^{n-2} + X^2 A X^{n-3} + \cdots + X^{n-2} A X + X^{n-1} A \right)' \tag{8.31} $$

Equation (8.31) can also be written as

$$ \frac{\partial}{\partial X} \mathrm{tr}[A X^n] = \left( \sum_{i=0}^{n-1} X^i A X^{n-1-i} \right)' \tag{8.32} $$


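The two formulae above provide us with the capability of solving for the
gradient matrices of trace functions of polynomials in $X$. A particular
function of interest is the exponential matrix function $e^X$ which is
commonly defined by the infinite series

$$ e^X = I + X + \frac{1}{2!} X^2 + \cdots = \sum_{i=0}^{\infty} \frac{1}{i!} X^i \tag{8.33} $$

We proceed to evaluate

$$ \frac{\partial}{\partial X} \mathrm{tr}[e^X] \tag{8.34} $$

Since

$$ \mathrm{tr}[e^X] = \mathrm{tr}\left[ \sum_{i=0}^{\infty} \frac{1}{i!} X^i \right] = \sum_{i=0}^{\infty} \frac{1}{i!} \mathrm{tr}[X^i] \tag{8.35} $$

we can use Eq. (8.30) to find that

$$ \frac{\partial}{\partial X} \mathrm{tr}[e^X] = (e^X)' \tag{8.36} $$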


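We shall next compute

$$ \frac{\partial}{\partial X} \mathrm{tr}[X^{-1}] \tag{8.37} $$

First recall the relation (see Eq. (7.11))

$$ dX^{-1} = -X^{-1} (dX) X^{-1} \tag{8.38} $$

It follows that

$$ d\, \mathrm{tr}[X^{-1}] = \mathrm{tr}[dX^{-1}] = -\mathrm{tr}[X^{-1} (dX) X^{-1}] \tag{8.39} $$

and, so,

$$ \frac{\partial}{\partial x_{ij}} \mathrm{tr}[X^{-1}] = -\mathrm{tr}\left[ X^{-1} \frac{\partial X}{\partial x_{ij}} X^{-1} \right] = -\mathrm{tr}[X^{-1} E_{ij} X^{-1}] = -\mathrm{tr}[X^{-2} E_{ij}] \tag{8.40} $$

From Eq. (8.40) and Lemma 8.1 we conclude that

$$ \frac{\partial}{\partial X} \mathrm{tr}[X^{-1}] = -(X^{-2})' \tag{8.41} $$

In a similar fashion we can show that

$$ \frac{\partial}{\partial X} \mathrm{tr}[A X^{-1} B] = -(X^{-1} B A X^{-1})' \tag{8.42} $$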
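The gradient of the trace of an inverse, -(X^{-2})', can be checked the same way as the polynomial cases. The sketch below is plain Python restricted to 2 x 2 matrices so that the inverse has a closed form. The helper names and the sample matrix are assumptions made for the illustration, and the test matrix is deliberately non-symmetric so that a missing transpose would show up.

```python
def inv2(m):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def numerical_gradient(func, x, h=1e-6):
    """Central-difference estimate of the matrix of partials d func / d x_ij."""
    grad = [[0.0] * len(x[0]) for _ in x]
    for i in range(len(x)):
        for j in range(len(x[0])):
            plus = [row[:] for row in x]
            minus = [row[:] for row in x]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (func(plus) - func(minus)) / (2 * h)
    return grad


def max_abs_diff(a, b):
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


X = [[2.0, 1.0], [0.5, 3.0]]
X_inv = inv2(X)

grad_trace_inv = numerical_gradient(lambda x: trace(inv2(x)), X)
# Expected gradient: -(X^{-2})', i.e. minus the transpose of X^{-1} X^{-1}
expected = [[-v for v in row] for row in transpose(matmul(X_inv, X_inv))]
```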


9.  GRADIENT MATRICES OF DETERMINANT FUNCTIONS


The trace $\mathrm{tr}[X]$ and the determinant $\det[X]$ of a matrix $X$ are
the two most used scalar functions of a matrix. In the previous section we
developed relations for the gradient matrix of trace functions. In this
section we shall develop similar relations for the gradient matrix of
determinant functions.

Before commencing the computations it is necessary to state some of the
properties of the determinant function. Let $X$ be an n x n matrix. Let
$\lambda_1, \lambda_2, \ldots, \lambda_n$ be the eigenvalues of $X$; for
simplicity we shall assume that these eigenvalues are distinct. It is always
true that the trace of $X$ equals the sum of the eigenvalues while the
determinant of $X$ is the product of the eigenvalues; in other words,

$$ \mathrm{tr}[X] = \lambda_1 + \lambda_2 + \cdots + \lambda_n \tag{9.1} $$

$$ \det[X] = \lambda_1 \lambda_2 \cdots \lambda_n \tag{9.2} $$

The determinant has the following properties:

$$ \det[X Y] = \det[X] \det[Y] \tag{9.3} $$

$$ \det[X + Y] \neq \det[X] + \det[Y] \quad \text{(in general)} \tag{9.4} $$

$$ \det[I] = 1 \tag{9.5} $$

$$ \det[X^{-1}] = 1 / \det[X] \tag{9.6} $$

$$ \det[X^n] = (\det X)^n \tag{9.7} $$

$$ \det[X] = \det[X'] \tag{9.8} $$

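In this section we shall use $\Lambda$ to denote the diagonal matrix whose
diagonal is formed by the eigenvalues of $X$, i.e.

$$ \Lambda = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \tag{9.9} $$

Clearly,

$$ \mathrm{tr}[\Lambda] = \lambda_1 + \lambda_2 + \cdots + \lambda_n \tag{9.10} $$

$$ \det[\Lambda] = \lambda_1 \lambda_2 \cdots \lambda_n \tag{9.11} $$

and, so,

$$ \mathrm{tr}[X] = \mathrm{tr}[\Lambda] \tag{9.12} $$

$$ \det[X] = \det[\Lambda] \tag{9.13} $$

Using the differential operator we have

$$ d(\mathrm{tr}[X]) = d(\mathrm{tr}[\Lambda]) \tag{9.14} $$

$$ d(\det[X]) = d(\det[\Lambda]) \tag{9.15} $$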


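Now we compute the differential of $\det[X]$ (provided $X$ is nonsingular):

$$ \begin{aligned}
d(\det[X]) &= d(\lambda_1 \lambda_2 \cdots \lambda_n) \\
&= (d\lambda_1) \lambda_2 \lambda_3 \cdots \lambda_n + \lambda_1 (d\lambda_2) \lambda_3 \cdots \lambda_n + \cdots + \lambda_1 \lambda_2 \cdots \lambda_{n-1} (d\lambda_n) \\
&= (\det[\Lambda]) \left[ \frac{d\lambda_1}{\lambda_1} + \frac{d\lambda_2}{\lambda_2} + \cdots + \frac{d\lambda_n}{\lambda_n} \right]
\end{aligned} \tag{9.16} $$

We note that we can identify

$$ \sum_{i=1}^{n} \frac{d\lambda_i}{\lambda_i} = \mathrm{tr}[\Lambda^{-1} d\Lambda] $$

and, so, in view of (9.13) we have

$$ d(\det[X]) = (\det[X])\, \mathrm{tr}[\Lambda^{-1} d\Lambda] \tag{9.17} $$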
We shall now prove the following lemma:

Lemma 9.1  If $X$ is nonsingular and if it has distinct eigenvalues
$\lambda_1, \lambda_2, \ldots, \lambda_n$, then

$$ \mathrm{tr}[\Lambda^{-1} d\Lambda] = \mathrm{tr}[X^{-1} dX] \tag{9.18} $$

Proof: $X$ and $\Lambda$ are related by the similarity transformation

$$ \Lambda = P^{-1} X P \tag{9.19} $$

Thus

$$ \Lambda^{-1} = P^{-1} X^{-1} P \tag{9.20} $$

From Eq. (9.19) we have

$$ P \Lambda = X P \tag{9.21} $$

and, so,

$$ (dP) \Lambda + P (d\Lambda) = (dX) P + X (dP) \tag{9.22} $$

It follows that

$$ d\Lambda = P^{-1} (dX) P + P^{-1} X (dP) - P^{-1} (dP) \Lambda \tag{9.23} $$

From Eqs. (9.19), (9.20) and (9.23) we obtain

$$ \begin{aligned}
\Lambda^{-1} d\Lambda &= P^{-1} X^{-1} P P^{-1} (dX) P + P^{-1} X^{-1} P P^{-1} X (dP) - P^{-1} X^{-1} P P^{-1} (dP) P^{-1} X P \\
&= P^{-1} X^{-1} (dX) P + P^{-1} (dP) - P^{-1} X^{-1} (dP) P^{-1} X P
\end{aligned} \tag{9.24} $$

Forming the trace of both sides and using the properties (2.11) and (2.12) we find

$$ \mathrm{tr}[\Lambda^{-1} d\Lambda] = \mathrm{tr}[X^{-1} dX] \tag{9.25} $$

Q.E.D.


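Using Eqs. (9.18) and (9.17) we arrive at

$$ d(\det[X]) = (\det[X])\, \mathrm{tr}[X^{-1} dX] \tag{9.26} $$

Equation (9.26) is true even if the eigenvalues are not distinct; see Ref. [7], p. 35.

We can now compute the gradient matrix of $\det[X]$, i.e. the matrix

$$ \frac{\partial}{\partial X} \det[X] \tag{9.27} $$

From (9.26) we have

$$ \frac{\partial}{\partial x_{ij}} \det[X] = (\det[X])\, \mathrm{tr}\left[ X^{-1} \frac{\partial X}{\partial x_{ij}} \right] = \det[X]\, \mathrm{tr}[X^{-1} E_{ij}] \tag{9.28} $$

and, so, Lemma 8.1 yields

$$ \frac{\partial}{\partial X} \det[X] = (\det[X]) (X^{-1})' \tag{9.29} $$

If we write Eq. (9.26) in the more suggestive form

$$ \frac{d(\det[X])}{\det[X]} = \mathrm{tr}[X^{-1} dX] \tag{9.30} $$

we can see that

$$ d(\log \det[X]) = \mathrm{tr}[X^{-1} dX] \tag{9.31} $$

and that

$$ \frac{\partial}{\partial X} \log \det[X] = (X^{-1})' \tag{9.32} $$

(a most useful relation). Using the property (9.3) of the determinant
function it is easy to prove that

$$ \frac{\partial}{\partial X} \det[A X B] = (\det[A X B]) (X^{-1})' \tag{9.33} $$

Also, it is easy to show (in view of (9.8)) that

$$ \frac{\partial}{\partial X} \det[X'] = \frac{\partial}{\partial X} \det[X] = (\det[X]) (X^{-1})' \tag{9.34} $$

From the obvious relation

$$ d(\det[X^n]) = d(\det[X])^n = n (\det[X])^{n-1} d(\det X) \tag{9.35} $$

we conclude that Eq. (9.26) yields

$$ d(\det[X^n]) = n (\det[X])^n\, \mathrm{tr}[X^{-1} dX] \tag{9.36} $$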
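For a 2 x 2 matrix the determinant gradient can be written out by hand, which makes a convenient check on the general statements that the gradient of det[X] is det[X] times the transpose of the inverse, and that the gradient of log det[X] is the transpose of the inverse alone. The sketch below is plain Python; the helper names and the sample matrix (chosen with a positive determinant so that the logarithm is defined) are assumptions for the illustration.

```python
import math


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c


def inv2(m):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = det2(m)
    return [[d / det, -b / det], [-c / det, a / det]]


def transpose(a):
    return [list(col) for col in zip(*a)]


def numerical_gradient(func, x, h=1e-6):
    """Central-difference estimate of the matrix of partials d func / d x_ij."""
    grad = [[0.0] * len(x[0]) for _ in x]
    for i in range(len(x)):
        for j in range(len(x[0])):
            plus = [row[:] for row in x]
            minus = [row[:] for row in x]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (func(plus) - func(minus)) / (2 * h)
    return grad


def max_abs_diff(a, b):
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


X = [[2.0, 1.0], [0.5, 3.0]]          # det[X] = 5.5 > 0
inv_t = transpose(inv2(X))            # (X^{-1})'

grad_det = numerical_gradient(det2, X)
grad_logdet = numerical_gradient(lambda x: math.log(det2(x)), X)
expected_det = [[det2(X) * v for v in row] for row in inv_t]
expected_logdet = inv_t
```

For this X, `expected_det` is the cofactor matrix [[3.0, -0.5], [-1.0, 2.0]], which is what differentiating ad - bc entry by entry gives directly.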


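10.  PARTITIONED MATRICES

It is often necessary to work with partitioned matrices. The following
formulae are very useful.

Consider the n x n matrix $X$ partitioned as follows:

$$ X = \begin{bmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{bmatrix} \tag{10.1} $$

where

$X_{11}$ is an $n_1$ x $n_1$ matrix,

$X_{12}$ is an $n_1$ x $n_2$ matrix,

$X_{21}$ is an $n_2$ x $n_1$ matrix,

$X_{22}$ is an $n_2$ x $n_2$ matrix,

$n_1 + n_2 = n$.

Assume the necessary inverses exist and that $X^{-1}$ is also partitioned as
in (10.1). Then

$$ X^{-1} = \begin{bmatrix} X_{11}^{-1} + X_{11}^{-1} X_{12} \Delta^{-1} X_{21} X_{11}^{-1} & -X_{11}^{-1} X_{12} \Delta^{-1} \\ -\Delta^{-1} X_{21} X_{11}^{-1} & \Delta^{-1} \end{bmatrix} \tag{10.2} $$

where

$$ \Delta = X_{22} - X_{21} X_{11}^{-1} X_{12} \tag{10.3} $$

From (10.2), the following is obtained:

$$ (X_{11} - X_{12} X_{22}^{-1} X_{21})^{-1} = X_{11}^{-1} + X_{11}^{-1} X_{12} (X_{22} - X_{21} X_{11}^{-1} X_{12})^{-1} X_{21} X_{11}^{-1} \tag{10.4} $$

with the special case

$$ (I + X)^{-1} = I - (I + X^{-1})^{-1} \tag{10.5} $$

Other useful formulae are

$$ \det[X] = \det[X_{11} - X_{12} X_{22}^{-1} X_{21}]\, \det[X_{22}] \tag{10.6} $$

$$ \mathrm{tr}[X] = \mathrm{tr}[X_{11}] + \mathrm{tr}[X_{22}] \tag{10.7} $$

If $Y$ is also partitioned as in (10.1), then

$$ X Y = \begin{bmatrix} X_{11} Y_{11} + X_{12} Y_{21} & X_{11} Y_{12} + X_{12} Y_{22} \\ X_{21} Y_{11} + X_{22} Y_{21} & X_{21} Y_{12} + X_{22} Y_{22} \end{bmatrix} \tag{10.8} $$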
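A partitioned inverse can be verified by assembling it block by block and multiplying back. The sketch below uses one standard arrangement, written in terms of the Schur complement of X11 (called `schur` in the code); whether this is exactly the arrangement printed in (10.2) is an assumption. It is plain Python with 2 x 2 blocks, and all helper names and sample values are illustrative.

```python
def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def sub(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def neg(a):
    return [[-v for v in row] for row in a]


def inv2(m):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def block_inverse(x11, x12, x21, x22):
    """Blocks of X^{-1}, assuming X11 and its Schur complement are invertible."""
    x11_inv = inv2(x11)
    schur = sub(x22, matmul(matmul(x21, x11_inv), x12))   # X22 - X21 X11^-1 X12
    schur_inv = inv2(schur)
    top_right = neg(matmul(matmul(x11_inv, x12), schur_inv))
    bottom_left = neg(matmul(matmul(schur_inv, x21), x11_inv))
    # X11^-1 + X11^-1 X12 S^-1 X21 X11^-1, reusing top_right = -X11^-1 X12 S^-1
    top_left = sub(x11_inv, matmul(top_right, matmul(x21, x11_inv)))
    return top_left, top_right, bottom_left, schur_inv


def assemble(b11, b12, b21, b22):
    """Stack four blocks into one matrix, laid out as in (10.1)."""
    top = [r1 + r2 for r1, r2 in zip(b11, b12)]
    bottom = [r1 + r2 for r1, r2 in zip(b21, b22)]
    return top + bottom


X11 = [[4.0, 1.0], [2.0, 3.0]]
X12 = [[1.0, 0.0], [0.0, 1.0]]
X21 = [[0.0, 1.0], [1.0, 0.0]]
X22 = [[5.0, 2.0], [1.0, 4.0]]

X = assemble(X11, X12, X21, X22)
X_inv = assemble(*block_inverse(X11, X12, X21, X22))
product = matmul(X, X_inv)   # should be close to the 4 x 4 identity
```

The block form only requires inverting matrices of the block sizes, which is the practical reason for using it when n is large and the partition is natural to the problem.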


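TABLE OF GRADIENTS


1.  $\dfrac{\partial}{\partial X} \mathrm{tr}[X] = I$

2.  $\dfrac{\partial}{\partial X} \mathrm{tr}[A X] = A'$

3.  $\dfrac{\partial}{\partial X} \mathrm{tr}[A X'] = A$

4.  $\dfrac{\partial}{\partial X} \mathrm{tr}[A X B] = A' B'$

5.  $\dfrac{\partial}{\partial X} \mathrm{tr}[A X' B] = B A$

6.  $\dfrac{\partial}{\partial X'} \mathrm{tr}[A X] = A$

7.  $\dfrac{\partial}{\partial X'} \mathrm{tr}[A X'] = A'$

8.  $\dfrac{\partial}{\partial X'} \mathrm{tr}[A X B] = B A$

9.  $\dfrac{\partial}{\partial X'} \mathrm{tr}[A X' B] = A' B'$

10.  $\dfrac{\partial}{\partial X} \mathrm{tr}[X X] = 2 X'$

11.  $\dfrac{\partial}{\partial X} \mathrm{tr}[X X'] = 2 X$

12.  $\dfrac{\partial}{\partial X} \mathrm{tr}[X^n] = n (X^{n-1})'$

13.  $\dfrac{\partial}{\partial X} \mathrm{tr}[A X^n] = \left( \sum_{i=0}^{n-1} X^i A X^{n-1-i} \right)'$

14.  $\dfrac{\partial}{\partial X} \mathrm{tr}[A X B X] = A' X' B' + B' X' A'$

15.  $\dfrac{\partial}{\partial X} \mathrm{tr}[A X B X'] = A' X B' + A X B$

16.  $\dfrac{\partial}{\partial X} \mathrm{tr}[e^X] = (e^X)'$

17.  $\dfrac{\partial}{\partial X} \mathrm{tr}[X^{-1}] = -(X^{-1} X^{-1})' = -(X^{-2})'$

18.  $\dfrac{\partial}{\partial X} \mathrm{tr}[A X^{-1} B] = -(X^{-1} B A X^{-1})'$

19.  $\dfrac{\partial}{\partial X} \det[X] = (\det[X]) (X^{-1})'$

20.  $\dfrac{\partial}{\partial X} \log \det[X] = (X^{-1})'$

21.  $\dfrac{\partial}{\partial X} \det[A X B] = (\det[A X B]) (X^{-1})'$

22.  $\dfrac{\partial}{\partial X} \det[X'] = (\det[X]) (X^{-1})'$

23.  $\dfrac{\partial}{\partial X} \det[X^n] = n (\det[X])^n (X^{-1})'$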